Mining frequent sequential patterns under regular expressions: a highly adaptative strategy for pushing constraints
Abstract
This paper introduces a new framework for extracting frequent sequences that satisfy a given regular expression (RE) constraint. In contrast to previous work (the SPIRIT algorithms), we represent REs as tree structures, and our algorithm can dynamically choose an extraction method according to the local selectivity of the sub-REs. Interestingly, pruning can rely not only on the anti-monotonic minimal frequency constraint but also on the RE constraint, which is generally not anti-monotonic. Preliminary experiments on synthetic data show that our algorithm behaves like the best algorithm in the SPIRIT family and even outperforms it.
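To make the problem statement concrete, the following is a minimal sketch of a naive baseline: enumerate frequent sequences level by level, using only the anti-monotonic frequency constraint for pruning, then filter the results with the RE. It is not the tree-based algorithm of the paper, nor any SPIRIT variant. The function names, the single-character item encoding, the toy database, and the use of Python's `re` syntax are all assumptions made for illustration.

```python
import re

def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of database sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in database)

def frequent_sequences(database, min_support):
    """Level-wise enumeration, assuming min_support >= 1.

    Only frequent patterns are extended, which is safe because frequency
    is anti-monotonic: every extension of an infrequent pattern is infrequent.
    """
    alphabet = sorted({item for s in database for item in s})
    level = [a for a in alphabet if support(a, database) >= min_support]
    result = []
    while level:
        result.extend(level)
        next_level = []
        for p in level:
            for a in alphabet:
                candidate = p + a
                if support(candidate, database) >= min_support:
                    next_level.append(candidate)
        level = next_level
    return result

def mine_with_re(database, min_support, regex):
    """Naive strategy: mine all frequent sequences, then keep those matching the RE."""
    compiled = re.compile(regex)
    return [p for p in frequent_sequences(database, min_support)
            if compiled.fullmatch(p)]

# Hypothetical toy database: each string is a sequence of single-character items.
database = ["abcb", "abbc", "acb", "bca"]
print(mine_with_re(database, 3, r"ab*c?"))  # prints ['a', 'ab', 'ac']
```

In this sketch the RE is applied only after mining, so every frequent sequence is generated even if it can never match. Pushing the RE constraint into the enumeration, as the abstract describes, aims to avoid that wasted work, with the difficulty that the RE constraint is generally not anti-monotonic.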
Similar resources
Mining Frequent Sequential Patterns under Regular Expressions: A Highly Adaptive Strategy for Pushing Constraints
This paper introduces a new framework for the extraction of frequent sequences satisfying a given regular expression (RE) constraint. Contrary to previous work (SPIRIT algorithms), we represent REs by tree structures and our algorithm can choose dynamically an extraction method according to the local selectivity of the sub-REs. Interestingly, pruning can rely not only on the anti-monotonic mini...
First-Order Temporal Pattern Mining with Regular Expression Constraints
Previous studies on mining sequential patterns have focused on temporal patterns specified by some form of propositional temporal logic. However, there are some interesting sequential patterns, such as the multi-sequential patterns, whose specification needs a more expressive formalism, the first-order temporal logic. In this paper, we extend a well-known user-controlled tool, based on regular ...
SPIRIT: Sequential Pattern Mining with Regular Expression Constraints
Discovering sequential patterns is an important problem in data mining with a host of application domains including medicine, telecommunications, and the World Wide Web. Conventional mining systems provide users with only a very restricted mechanism (based on minimum support) for specifying patterns of interest. In this paper, we propose the use of Regular Expressions (REs) as a flexible constr...
Mining Sequential Patterns with Regular Expression Constraints
Discovering sequential patterns is an important problem in data mining with a host of application domains including medicine, telecommunications, and the World Wide Web. Conventional sequential pattern mining systems provide users with only a very restricted mechanism (based on minimum support) for specifying patterns of interest. As a consequence, the pattern mining process is typically chara...
Pushing Constraints to Generate Top-K Closed Sequential Graph Patterns
In this paper, the problem of finding sequential patterns from graph databases is investigated. Two serious issues dealt with in this paper are the efficiency and effectiveness of the mining algorithm. A huge volume of sequential patterns is generated, most of which are uninteresting. The users have to go through a large number of patterns to find interesting results. In order to improve th...
Publication date: 2003